Final Project - Indicators of Anxiety or Depression Based on Reported Frequency of Symptoms During Last 7 Days

Author

Ian Walsh & Logan Rosell

Published

November 11, 2025

Research Question: How did anxiety and depression levels differ between states following the outbreak of COVID-19 in the United States?

Data Cleaning

Import libraries and dataset

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

df = pd.read_csv("./Datasets/Indicators_of_Anxiety_or_Depression_Based_on_Reported_Frequency_of_Symptoms_During_Last_7_Days.csv")

df.head()
   Indicator                        Group              State          Subgroup       Phase  Time Period  Time Period Label     Time Period Start Date  Time Period End Date  Value  Low CI  High CI  Confidence Interval  Quartile Range
0  Symptoms of Depressive Disorder  National Estimate  United States  United States  1      1            Apr 23 - May 5, 2020  04/23/2020              05/05/2020            23.5   22.7    24.3     22.7 - 24.3          NaN
1  Symptoms of Depressive Disorder  By Age             United States  18 - 29 years  1      1            Apr 23 - May 5, 2020  04/23/2020              05/05/2020            32.7   30.2    35.2     30.2 - 35.2          NaN
2  Symptoms of Depressive Disorder  By Age             United States  30 - 39 years  1      1            Apr 23 - May 5, 2020  04/23/2020              05/05/2020            25.7   24.1    27.3     24.1 - 27.3          NaN
3  Symptoms of Depressive Disorder  By Age             United States  40 - 49 years  1      1            Apr 23 - May 5, 2020  04/23/2020              05/05/2020            24.8   23.3    26.2     23.3 - 26.2          NaN
4  Symptoms of Depressive Disorder  By Age             United States  50 - 59 years  1      1            Apr 23 - May 5, 2020  04/23/2020              05/05/2020            23.2   21.5    25.0     21.5 - 25.0          NaN

Filter to state-level rows only and drop unnecessary columns: Group and Subgroup are redundant once the data is filtered by state, and Time Period Label and Confidence Interval only combine values that are already stored in other columns.

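# .copy() gives an independent DataFrame, which avoids SettingWithCopyWarning on later assignments
state_data = df[df['Group'] == 'By State'].copy()

state_data = state_data.drop(columns=['Group', 'Subgroup', 'Time Period Label', 'Confidence Interval'])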
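
Modifying a DataFrame that was obtained by boolean filtering is the usual trigger for pandas' SettingWithCopyWarning. A minimal sketch, assuming a small made-up frame (the names `toy` and `states_only` and all values are illustrative, not taken from the dataset), of how an explicit `.copy()` plus reassignment sidesteps it:

```python
import pandas as pd

# Toy frame standing in for the survey data (values are made up).
toy = pd.DataFrame({
    "Group": ["By State", "By Age", "By State"],
    "State": ["Alabama", "United States", "Alaska"],
    "Value": [18.6, 32.7, 19.2],
})

# .copy() makes the filtered frame independent of `toy`, so later
# column assignments are unambiguous and do not raise the warning.
states_only = toy[toy["Group"] == "By State"].copy()

# Reassign the result of drop() instead of relying on inplace=True.
states_only = states_only.drop(columns=["Group"])

# Adding a column now affects only the copy, never the original frame.
states_only["Value_Rounded"] = states_only["Value"].round()
```

The original frame keeps its columns, which is often what is wanted when the raw data is reused later for other subsets.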

Separate Quartile Range into two columns:

# Some rows use ' - ' and others '-' as the separator, so split on a hyphen with optional surrounding whitespace
state_data[['Quartile_Lower', 'Quartile_Upper']] = state_data['Quartile Range'].str.split(r'\s*-\s*', expand=True, regex=True).astype(float)
state_data = state_data.drop(columns='Quartile Range')
state_data
       Indicator                                           State          Phase  Time Period  Time Period Start Date  Time Period End Date  Value  Low CI  High CI  Quartile_Lower  Quartile_Upper
19     Symptoms of Depressive Disorder                     Alabama        1      1            04/23/2020              05/05/2020            18.6   14.6    23.1     16.5            20.7
20     Symptoms of Depressive Disorder                     Alaska         1      1            04/23/2020              05/05/2020            19.2   16.8    21.8     16.5            20.7
21     Symptoms of Depressive Disorder                     Arizona        1      1            04/23/2020              05/05/2020            22.4   19.4    25.5     22.2            24.0
22     Symptoms of Depressive Disorder                     Arkansas       1      1            04/23/2020              05/05/2020            26.6   22.3    31.3     24.1            28.7
23     Symptoms of Depressive Disorder                     California     1      1            04/23/2020              05/05/2020            25.4   22.5    28.6     24.1            28.7
...    ...                                                 ...            ...    ...          ...                     ...                   ...    ...     ...      ...             ...
14368  Symptoms of Anxiety Disorder or Depressive Dis...  Virginia       3.10   62           09/20/2023              10/02/2023            33.2   29.8    36.7     30.7            33.5
14369  Symptoms of Anxiety Disorder or Depressive Dis...  Washington     3.10   62           09/20/2023              10/02/2023            34.3   31.0    37.7     33.6            36.2
14370  Symptoms of Anxiety Disorder or Depressive Dis...  West Virginia  3.10   62           09/20/2023              10/02/2023            44.7   40.0    49.4     36.3            44.7
14371  Symptoms of Anxiety Disorder or Depressive Dis...  Wisconsin      3.10   62           09/20/2023              10/02/2023            30.4   27.1    33.9     24.5            30.6
14372  Symptoms of Anxiety Disorder or Depressive Dis...  Wyoming        3.10   62           09/20/2023              10/02/2023            36.3   30.1    42.9     36.3            44.7

9486 rows × 11 columns
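
Range strings such as "16.5 - 20.7" and "30.7-33.5" differ only in the whitespace around the hyphen. As an alternative to splitting, a regular expression can capture both bounds directly. A minimal sketch on made-up input (the names `ranges` and `bounds` are illustrative, and it assumes the bounds are non-negative decimals):

```python
import pandas as pd

# Sample range strings in both layouts, plus a missing value.
ranges = pd.Series(["16.5 - 20.7", "30.7-33.5", None])

# Capture the number before and after the hyphen; whitespace around
# the hyphen is optional. Rows that do not match become NaN.
bounds = ranges.str.extract(r"^\s*([\d.]+)\s*-\s*([\d.]+)\s*$").astype(float)
bounds.columns = ["Quartile_Lower", "Quartile_Upper"]

print(bounds)
```

Converting to float at this step means the bounds can be compared with the Value column without further casting.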

Clean up the Phase column:

state_data['Phase'].unique()
# Two Phase values also contain dates that are already stored in other columns, so keep only the phase number

state_data['Phase'] = state_data['Phase'].str.split(' ').str[0]
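
Keeping only the text before the first space can be checked on a few sample labels. In this sketch the label with a date, "3 (Oct 28 - Nov 9)", is a hypothetical stand-in, since the exact strings in the dataset are not shown here:

```python
import pandas as pd

# Hypothetical Phase labels; one carries a trailing date range.
phases = pd.Series(["1", "2", "3 (Oct 28 - Nov 9)", "3.10"])

# Split on spaces and keep the first token, i.e. the phase number.
clean = phases.str.split(" ").str[0]

print(clean.tolist())
```

Labels without a space pass through unchanged, so the step is safe to apply to the whole column.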

Change Data Types as needed

state_data['Indicator'] = pd.Categorical(state_data['Indicator'], categories = ['Symptoms of Depressive Disorder', 'Symptoms of Anxiety Disorder', 'Symptoms of Anxiety Disorder or Depressive Disorder'])

state_data['Phase'] = pd.Categorical(state_data['Phase'], categories=['1', '2', '3', '3.1', '3.2', '3.3', '3.4', '3.5', '3.6', '3.7', '3.8', '3.9', '3.10'])

state_data['Time Period Start Date'] = pd.to_datetime(state_data['Time Period Start Date']).dt.date
state_data['Time Period End Date'] = pd.to_datetime(state_data['Time Period End Date']).dt.date
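
An explicit category list matters for Phase because the labels are strings: sorted as plain text, '3.10' comes before '3.2'. The sketch below shows the difference on a few sample labels. Passing `ordered=True` is an assumption added here (the cell above does not set it); it additionally enables comparisons such as `max()`.

```python
import pandas as pd

phase_order = ['1', '2', '3', '3.1', '3.2', '3.3', '3.4', '3.5',
               '3.6', '3.7', '3.8', '3.9', '3.10']

phases = pd.Series(["3.10", "3.2", "1", "3.9"])

# Plain string sorting is lexicographic, so '3.10' lands before '3.2'.
text_sorted = sorted(phases)

# A categorical sorts by the position of each label in `phase_order`.
ordered = pd.Categorical(phases, categories=phase_order, ordered=True)
cat_sorted = pd.Series(ordered).sort_values().astype(str).tolist()

# With ordered=True the latest phase can be found directly.
latest = ordered.max()
```

The same idea keeps phase labels in survey order on plot axes and in grouped output.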

EDA

Looking at some graphs

# Histogram of values for all states
plt1 = sns.histplot(state_data, x='Value', hue = 'Indicator', alpha = 0.5)
plt.title('Histogram of Value by Indicator')
plt.show()

# Pair Plot
pair_plot = sns.pairplot(state_data[['Indicator','Value','Time Period']], hue = 'Indicator')
plt.show()

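# Unweighted mean of the state values per period and indicator; observed=True silences the pandas FutureWarning
national_avgs = state_data.groupby(['Time Period Start Date', 'Indicator'], observed=True).agg(
    nat_means = ('Value', 'mean')
).reset_index()
nat_avg_plt = sns.lineplot(national_avgs,
                            x='Time Period Start Date',
                            y='nat_means',
                            hue = 'Indicator')
plt.xticks(rotation=45)
plt.title('Mean State Value Over Time by Indicator')
plt.ylabel('Mean Value')
plt.show()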
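
Grouping on a categorical column behaves differently depending on the `observed` argument. A minimal sketch on made-up numbers (the frame `toy`, the shortened indicator names, and the values are illustrative) of the per-period mean and of what `observed=True` changes:

```python
import pandas as pd

toy = pd.DataFrame({
    "Time Period Start Date": ["2020-04-23"] * 4,
    "Indicator": pd.Categorical(
        ["Depressive", "Depressive", "Anxiety", "Anxiety"],
        categories=["Depressive", "Anxiety", "Either"],
    ),
    "Value": [18.0, 22.0, 25.0, 31.0],
})

# observed=True keeps only category values that occur in the data,
# so the unused "Either" category produces no empty group.
means = (
    toy.groupby(["Time Period Start Date", "Indicator"], observed=True)
       .agg(nat_means=("Value", "mean"))
       .reset_index()  # turn the group keys back into ordinary columns
)

print(means)
```

Note that a mean taken across states weights every state equally, so it is not the same quantity as a population-weighted national estimate.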

state_code_map = {
    "Alabama": "AL",
    "Alaska": "AK",
    "Arizona": "AZ",
    "Arkansas": "AR",
    "California": "CA",
    "Colorado": "CO",
    "Connecticut": "CT",
    "Delaware": "DE",
    "Florida": "FL",
    "Georgia": "GA",
    "Hawaii": "HI",
    "Idaho": "ID",
    "Illinois": "IL",
    "Indiana": "IN",
    "Iowa": "IA",
    "Kansas": "KS",
    "Kentucky": "KY",
    "Louisiana": "LA",
    "Maine": "ME",
    "Maryland": "MD",
    "Massachusetts": "MA",
    "Michigan": "MI",
    "Minnesota": "MN",
    "Mississippi": "MS",
    "Missouri": "MO",
    "Montana": "MT",
    "Nebraska": "NE",
    "Nevada": "NV",
    "New Hampshire": "NH",
    "New Jersey": "NJ",
    "New Mexico": "NM",
    "New York": "NY",
    "North Carolina": "NC",
    "North Dakota": "ND",
    "Ohio": "OH",
    "Oklahoma": "OK",
    "Oregon": "OR",
    "Pennsylvania": "PA",
    "Rhode Island": "RI",
    "South Carolina": "SC",
    "South Dakota": "SD",
    "Tennessee": "TN",
    "Texas": "TX",
    "Utah": "UT",
    "Vermont": "VT",
    "Virginia": "VA",
    "Washington": "WA",
    "West Virginia": "WV",
    "Wisconsin": "WI",
    "Wyoming": "WY",
    "District of Columbia": "DC",
    "American Samoa": "AS",
    "Guam": "GU",
    "Northern Mariana Islands": "MP",
    "Puerto Rico": "PR",
    "United States Minor Outlying Islands": "UM",
    "Virgin Islands, U.S.": "VI",
}

indicators = state_data['Indicator'].unique()
color_scales = ['blues','amp','purp']

for indicator, color_scale in zip(indicators, color_scales):
    # Copy the subset so that adding State_Code does not raise SettingWithCopyWarning
    fig_data = state_data[state_data['Indicator'] == indicator].copy()

    fig_data['State_Code'] = fig_data['State'].map(state_code_map)

    # Use one color range for all animation frames; these names avoid shadowing the built-ins min and max
    value_min = fig_data['Value'].min()
    value_max = fig_data['Value'].max()

    fig = px.choropleth(
        fig_data,
        locations='State_Code',
        locationmode='USA-states',
        color='Value',
        scope='usa',
        title=f'Map of {indicator} in US states',
        hover_name='State',
        color_continuous_scale=color_scale,
        animation_frame='Time Period Start Date',
        range_color=[value_min, value_max]
    )
    fig.show()
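
Mapping state names to postal codes silently yields missing values for any name that is not a key in the dictionary, and those rows would then be absent from the map. A minimal sketch of a check for this, using a two-entry subset of the mapping and a deliberately misspelled name (`code_map`, `states`, and "Alabamma" are illustrative):

```python
import pandas as pd

# Small subset of a name-to-code mapping, for illustration only.
code_map = {"Alabama": "AL", "Alaska": "AK"}

# The last entry is intentionally misspelled.
states = pd.Series(["Alabama", "Alaska", "Alabamma"])

# Names without a matching key map to NaN.
codes = states.map(code_map)

# Collect the distinct names that failed to map.
unmapped = sorted(states[codes.isna()].unique())

print(unmapped)
```

An empty list from such a check would indicate that every state name in the data has a code.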